Providing structural bias for hierarchical clustering
Abstract
Constrained clustering has received a lot of attention in recent years. However, the widely used pairwise constraints are not generally applicable to hierarchical clustering, where the goal is to derive a cluster hierarchy instead of a flat partition. Therefore, building on the ideas of pairwise constraints, we propose the use of must-link-before (MLB) constraints for the hierarchical setting. In this paper, we discuss their properties and present an algorithm that creates a hierarchy by considering these constraints directly. Furthermore, we propose an efficient data structure for its implementation and evaluate its effectiveness on different datasets in a text clustering scenario.
Similar resources
Integrating Declarative Knowledge in Hierarchical Clustering Tasks
The capability of making use of existing prior knowledge is an important challenge for Knowledge Discovery tasks. As an unsupervised learning task, clustering appears to be one of the tasks that might benefit most from prior knowledge. In this paper, we propose a method for providing declarative prior knowledge to a hierarchical clustering system, stressing the interactive component. Pre...
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with huge volumes of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but it gives us no way to select the number of clusters and the number of members ...
A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory
This paper discusses the clustering quality and complexities of the hierarchical data clustering algorithm based on gravity theory. The gravity-based clustering algorithm simulates how the given N nodes in a K-dimensional continuous vector space will cluster due to the gravity force, provided that each node is associated with a mass. One of the main issues studied in this paper is how the order ...
Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics
This paper discusses the future of World Wide Web development, called the Semantic Web. Undoubtedly, the Web service is one of the most important services on the Internet and has had the greatest impact on the spread of the Internet in human societies. Internet penetration has been an effective factor in the growth of the volume of information on the Web. The massive growth of informat...
Application of Clustering Methods in DNA Microarrays
Background: DNA microarray technology has paved the way for investigators to examine the expression of thousands of genes in a short time. Analysis of this large amount of raw data includes normalization, clustering and classification. The present study surveys the application of clustering techniques in DNA microarray analysis. Materials and methods: We analyzed data from the Van’t Veer et al. study dealing with BRCA1...